This project aims to develop an Android application that enables blind people to detect currency denominations and make phone calls using handwritten text recognition and voice commands. The application uses image processing techniques and a trained TensorFlow model to recognize the denomination of banknotes, and an ML model to recognize handwritten text and convert it to a phone number. The user can initiate a phone call by saying a voice command, and the application provides audio feedback throughout the process.
I. INTRODUCTION
Blind people face several challenges in their daily lives, such as recognizing currency denominations and making phone calls. These tasks often require visual cues or touchscreen interactions, which are not accessible to blind people. To address these challenges, we propose an Android application that uses machine learning and image processing techniques to detect currency denominations and make phone calls using handwritten text recognition and voice commands.
The currency detection module uses a trained TensorFlow model to recognize the different features and patterns of banknotes, such as edges, corners, colors, and symbols. The user can capture an image of a banknote using the smartphone camera, and the application provides audio feedback on the denomination of the currency.
The calling activity module uses an ML model to recognize the user's handwritten text and convert it to a phone number. The user can initiate a phone call by saying a voice command, such as "call" or "dial," and the application provides audio feedback throughout the process.
The application aims to provide a user-friendly and accessible interface for blind people to detect currency denominations and make phone calls using handwritten text recognition and voice commands. It can help improve the independence and quality of life of blind people, enabling them to perform tasks that were previously inaccessible or challenging.
II. TOOLS AND TECHNOLOGY
Android Studio
Language: Java
The Android SDK provides Java libraries that are required for accessing mobile device functions.
Gradle, the build system bundled with Android Studio, is used to build and package the application.
The Google Speech API is required for voice input and recognition.
III. METHODOLOGY
A. Implementation
a. First, we added the required dependencies, which allow external libraries, local JAR files, and other library modules to be included in the Android project. We then designed the user interface of the application in XML.
In MainActivity.java we created the methods that let the user launch specific tasks with simple voice commands.
We also implemented a swipe touch event, as described in [4], to support left and right swipes.
Swiping left on the screen reads out the features and operations of the app.
Swiping right on the screen starts voice input. After the user gives a voice command, the application automatically redirects to the corresponding activity. For example, if the user says "I want to detect the currency," the currency detection activity opens automatically; the user then just taps the screen to take a picture and hear the value of the currency.
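For illustration, the swipe handling described above could be implemented with Android's GestureDetector as in the sketch below. The pixel threshold, layout resource, and the two helper methods are assumptions made for this example, not code taken from the project.

import android.os.Bundle;
import android.view.GestureDetector;
import android.view.MotionEvent;
import androidx.appcompat.app.AppCompatActivity;

public class MainActivity extends AppCompatActivity {
    private GestureDetector gestureDetector;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);    // assumed layout resource
        gestureDetector = new GestureDetector(this,
                new GestureDetector.SimpleOnGestureListener() {
            @Override
            public boolean onFling(MotionEvent e1, MotionEvent e2,
                                   float velocityX, float velocityY) {
                float deltaX = e2.getX() - e1.getX();
                if (Math.abs(deltaX) > 100) {      // assumed swipe threshold (pixels)
                    if (deltaX > 0) {
                        startVoiceInput();         // right swipe: listen for a command
                    } else {
                        speakFeatures();           // left swipe: read out app features
                    }
                    return true;
                }
                return false;
            }
        });
    }

    @Override
    public boolean onTouchEvent(MotionEvent event) {
        // Route all touch events through the detector so swipes are recognized.
        return gestureDetector.onTouchEvent(event) || super.onTouchEvent(event);
    }

    private void startVoiceInput() { /* hypothetical helper: start STT (see Methods Used) */ }
    private void speakFeatures()   { /* hypothetical helper: announce features via TTS */ }
}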
b. Methods Used:
i. Text-to-Speech (TTS): TTS converts text into spoken audio [3]. It is essential for giving voice feedback to the user: whenever the application needs to report a result or prompt for input, the text is read aloud. TTS is implemented wherever the software requires audio output.
ii. Speech-to-Text (STT): Android has a built-in speech-to-text feature through which the user can provide speech input to the software. In the background, the speech input is converted to text, the corresponding action is performed, and the outcome is announced through TTS.
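A minimal sketch of these two methods follows, assuming the default Android TextToSpeech engine and the platform speech recognizer; the locale and request code are illustrative choices, not values specified in the project.

import android.app.Activity;
import android.content.Intent;
import android.speech.RecognizerIntent;
import android.speech.tts.TextToSpeech;
import java.util.Locale;

public class SpeechHelper {
    public static final int REQ_STT = 100;     // arbitrary request code
    private final TextToSpeech tts;

    public SpeechHelper(Activity activity) {
        tts = new TextToSpeech(activity, status -> {
            if (status == TextToSpeech.SUCCESS) {
                tts.setLanguage(Locale.US);    // assumed language
            }
        });
    }

    // TTS: speaks feedback such as "Opening currency detection".
    public void speak(String text) {
        tts.speak(text, TextToSpeech.QUEUE_FLUSH, null, "feedback");
    }

    // STT: opens the platform recognizer; the recognized text arrives
    // in the activity's onActivityResult under REQ_STT.
    public void listen(Activity activity) {
        Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
        intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
                RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
        activity.startActivityForResult(intent, REQ_STT);
    }
}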
B. System Architecture
The proposed system consists of three main components: the voice-command main interface, the currency detection module, and the calling activity module.
C. Project Requirements
The requirements were arranged in two groups: user interface requirements and functional requirements.
1. User Interface Requirements
a. Easily accessible
b. Flexibility of voice control (Set speed, pause speech)
2. Functional Requirements
a. Switching among the different voices
b. Text to speech
c. Voice assistant
d. Exit: closes the app
IV. MODULE DESCRIPTION
A. Currency Detection Module
The Currency Detection module should be designed to recognize and identify the denomination of banknotes using a TensorFlow model. The model should be trained to identify the different features and patterns of the banknotes, such as edges, corners, colors, and symbols.
B. Main Components of the Currency Detection Module
Image Capture: The module should capture an image of the currency using the smartphone camera.
Pre-processing: The captured image should undergo preprocessing to resize, normalize, and enhance its quality.
TensorFlow Model: The module should load and run the trained TensorFlow model to recognize the currency denomination. The output of the model should be conveyed to the user in the form of an audio message or through haptic feedback.
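As an illustration of the capture-to-output pipeline above, the following sketch loads a TensorFlow Lite classifier and runs it on the captured photo. The model file name, input size, and label list are assumptions made for the example; the report does not specify them.

import android.content.Context;
import android.content.res.AssetFileDescriptor;
import android.graphics.Bitmap;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import org.tensorflow.lite.Interpreter;

public class CurrencyClassifier {
    private static final int INPUT_SIZE = 224;                 // assumed model input size
    private static final String[] LABELS =                     // assumed denomination labels
            {"10", "20", "50", "100", "200", "500", "2000"};
    private final Interpreter interpreter;

    public CurrencyClassifier(Context context) throws IOException {
        interpreter = new Interpreter(loadModel(context, "currency.tflite"));
    }

    // Memory-maps the bundled model file from the app's assets.
    private MappedByteBuffer loadModel(Context context, String name) throws IOException {
        AssetFileDescriptor fd = context.getAssets().openFd(name);
        FileChannel channel = new FileInputStream(fd.getFileDescriptor()).getChannel();
        return channel.map(FileChannel.MapMode.READ_ONLY,
                fd.getStartOffset(), fd.getDeclaredLength());
    }

    // Resizes and normalizes the captured image, runs inference, and
    // returns the label with the highest score (to be spoken via TTS).
    public String classify(Bitmap photo) {
        Bitmap resized = Bitmap.createScaledBitmap(photo, INPUT_SIZE, INPUT_SIZE, true);
        ByteBuffer input = ByteBuffer.allocateDirect(4 * INPUT_SIZE * INPUT_SIZE * 3)
                .order(ByteOrder.nativeOrder());
        int[] pixels = new int[INPUT_SIZE * INPUT_SIZE];
        resized.getPixels(pixels, 0, INPUT_SIZE, 0, 0, INPUT_SIZE, INPUT_SIZE);
        for (int p : pixels) {                                  // RGB, scaled to [0, 1]
            input.putFloat(((p >> 16) & 0xFF) / 255f);
            input.putFloat(((p >> 8) & 0xFF) / 255f);
            input.putFloat((p & 0xFF) / 255f);
        }
        float[][] scores = new float[1][LABELS.length];
        interpreter.run(input, scores);
        int best = 0;
        for (int i = 1; i < LABELS.length; i++) {
            if (scores[0][i] > scores[0][best]) best = i;
        }
        return LABELS[best];
    }
}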
C. Calling Activity Module
The Calling Activity module should enable the user to make phone calls using handwritten text recognition and voice commands. The module should use an ML model to recognize the user's handwritten text and convert it to a phone number, which the user can call by saying a voice command.
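The report does not name the handwriting-recognition model, so the sketch below uses Google's ML Kit text recognizer as a stand-in: it scans a photo of the handwritten digits and keeps only the digit characters as the candidate phone number.

import android.graphics.Bitmap;
import com.google.mlkit.vision.common.InputImage;
import com.google.mlkit.vision.text.TextRecognition;
import com.google.mlkit.vision.text.TextRecognizer;
import com.google.mlkit.vision.text.latin.TextRecognizerOptions;
import java.util.function.Consumer;

public class HandwrittenNumberReader {
    private final TextRecognizer recognizer =
            TextRecognition.getClient(TextRecognizerOptions.DEFAULT_OPTIONS);

    // Recognizes text in the photo asynchronously and passes only the
    // digit characters (the candidate phone number) to the callback.
    public void readNumber(Bitmap photo, Consumer<String> onNumber) {
        InputImage image = InputImage.fromBitmap(photo, 0);      // 0 = image rotation
        recognizer.process(image)
                .addOnSuccessListener(result -> {
                    // e.g. "98765 43210" -> "9876543210"
                    String digits = result.getText().replaceAll("\\D", "");
                    onNumber.accept(digits);
                })
                .addOnFailureListener(e -> onNumber.accept("")); // empty = recognition failed
    }
}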
D. Main Components of the Calling Activity Module
Handwritten Text Recognition: The module should use an ML model to recognize the user's handwritten text and convert it to a phone number. The model should be trained on a dataset of handwritten digits and symbols and be able to handle different styles and variations of handwriting.
Voice Command: The user should be able to initiate a phone call using a voice command, such as "call" or "dial." The module should use speech recognition technology to recognize the user's voice command and initiate the call.
Text-to-Speech: The module should provide audio feedback to the user throughout the process of initiating a call. The user should receive audio feedback confirming that the call has been initiated and who is being called.
Error Handling: The module should be able to detect and handle errors, such as incorrect voice commands or recognition failures. Appropriate feedback should be provided to the user in case of errors.
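Putting the voice-command, text-to-speech, and error-handling pieces together, the call flow might look like the sketch below. The request code and keyword matching are illustrative, and the app is assumed to hold the CALL_PHONE permission.

import android.Manifest;
import android.content.Intent;
import android.content.pm.PackageManager;
import android.net.Uri;
import android.speech.RecognizerIntent;
import androidx.appcompat.app.AppCompatActivity;
import java.util.ArrayList;
import java.util.Locale;

public class CallingActivity extends AppCompatActivity {
    private static final int REQ_SPEECH = 1;   // arbitrary request code
    private String recognizedNumber;           // produced by the handwriting model

    // Listens for a spoken command such as "call" or "dial".
    private void listenForCommand() {
        Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
        intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
                RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
        startActivityForResult(intent, REQ_SPEECH);
    }

    @Override
    protected void onActivityResult(int requestCode, int resultCode, Intent data) {
        super.onActivityResult(requestCode, resultCode, data);
        if (requestCode == REQ_SPEECH && resultCode == RESULT_OK && data != null) {
            ArrayList<String> results =
                    data.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS);
            String command = results.get(0).toLowerCase(Locale.ROOT);
            if (command.contains("call") || command.contains("dial")) {
                placeCall(recognizedNumber);   // confirmed to the user via TTS
            } else {
                // Error handling: announce the unrecognized command via TTS
                // and prompt the user to try again.
            }
        }
    }

    // Places the call if the CALL_PHONE permission has been granted.
    private void placeCall(String number) {
        if (checkSelfPermission(Manifest.permission.CALL_PHONE)
                == PackageManager.PERMISSION_GRANTED && number != null) {
            startActivity(new Intent(Intent.ACTION_CALL, Uri.parse("tel:" + number)));
        }
    }
}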
Overall, these modules should provide a user-friendly and accessible interface for blind people to detect currency denominations and make phone calls using handwritten text recognition and voice commands.
V. CONCLUSION
At present, most of our daily activities are performed through mobile apps on smartphones, but people with vision impairment require assistance to access these apps on handheld devices such as phones and tablets. Google and the wider Android ecosystem have developed various mobile apps for visually impaired people, but these apps still need to provide more effective facilities by adopting and synergizing suitable techniques from Artificial Intelligence [5]. This report introduced an application with two user-friendly modules for blind people: currency detection and a voice-driven calling activity. Developing such applications further is important for the future, and although the system is designed for blind people, sighted users can also benefit from it.
REFERENCES
[1] H. Nguyen, M. Nguyen, Q. Nguyen, S. Yang and H. Le, "Web-based object detection and sound feedback system for visually impaired people," 2020 International Conference on Multimedia Analysis and Pattern Recognition (MAPR), 2020, pp. 1-6, doi: 10.1109/MAPR49794.2020.9237770.
[2] H. Jiang, T. Gonnot, W. Yi and J. Saniie, "Computer vision and text recognition for assisting visually impaired people using Android smartphone," 2017 IEEE International Conference on Electro Information Technology (EIT), 2017, pp. 350-353, doi: 10.1109/EIT.2017.8053384.
[3] I. Nwakanma, I. Oluigbo and O. Izunna, "Text-to-Speech Synthesis (TTS)," vol. 2, pp. 154-163, 2014.
[4] X. Wu, Y. Jiang, C. Xu, C. Cao, X. Ma and J. Lu, "Testing Android Apps via Guided Gesture Event Generation," 2016 Asia-Pacific Software Engineering Conference (APSEC), 2016, pp. 201-208, doi: 10.1109/APSEC.2016.037.
[5] S. M. Felix, S. Kumar and A. Veeramuthu, "A Smart Personal AI Assistant for Visually Impaired People," 2018 2nd International Conference on Trends in Electronics and Informatics (ICOEI), 2018, pp. 1245-1250, doi: 10.1109/ICOEI.2018.8553750.